Automatic Classification of Users’ Health Information Need Context: Logistic Regression Analysis of Mouse-Click and Eye-Tracker Data
نویسندگان
چکیده
BACKGROUND Users searching for health information on the Internet may be searching for their own health issue, searching for someone else's health issue, or browsing with no particular health issue in mind. Previous research has found that these three categories of users focus on different types of health information. However, most health information websites provide static content for all users. If the three types of user health information need contexts can be identified by the Web application, the search results or information offered to the user can be customized to increase its relevance or usefulness to the user. OBJECTIVE The aim of this study was to investigate the possibility of identifying the three user health information contexts (searching for self, searching for others, or browsing with no particular health issue in mind) using just hyperlink clicking behavior; using eye-tracking information; and using a combination of eye-tracking, demographic, and urgency information. Predictive models are developed using multinomial logistic regression. METHODS A total of 74 participants (39 females and 35 males) who were mainly staff and students of a university were asked to browse a health discussion forum, Healthboards.com. An eye tracker recorded their examining (eye fixation) and skimming (quick eye movement) behaviors on 2 types of screens: summary result screen displaying a list of post headers, and detailed post screen. The following three types of predictive models were developed using logistic regression analysis: model 1 used only the time spent in scanning the summary result screen and reading the detailed post screen, which can be determined from the user's mouse clicks; model 2 used the examining and skimming durations on each screen, recorded by an eye tracker; and model 3 added user demographic and urgency information to model 2. RESULTS An analysis of variance (ANOVA) analysis found that users' browsing durations were significantly different for the three health information contexts (P<.001). The logistic regression model 3 was able to predict the user's type of health information context with a 10-fold cross validation mean accuracy of 84% (62/74), followed by model 2 at 73% (54/74) and model 1 at 71% (52/78). In addition, correlation analysis found that particular browsing durations were highly correlated with users' age, education level, and the urgency of their information need. CONCLUSIONS A user's type of health information need context (ie, searching for self, for others, or with no health issue in mind) can be identified with reasonable accuracy using just user mouse clicks that can easily be detected by Web applications. Higher accuracy can be obtained using Google glass or future computing devices with eye tracking function.
منابع مشابه
Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis
Background: Due to the importance of medical studies, researchers of this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be as complementary to regression models. We compared the performance of a logistic regression model and a CART in predi...
متن کاملPredicting The Type of Malaria Using Classification and Regression Decision Trees
Predicting The Type of Malaria Using Classification and Regression Decision Trees Maryam Ashoori1 *, Fatemeh Hamzavi2 1School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran 2School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran Abstract Background: Malaria is an infectious disease infecting 200 - 300 million people annually. Environme...
متن کاملUsing Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...
متن کاملDevelopment of an Automatic Land Use Extraction System in Urban Areas using VHR Aerial Imagery and GIS Vector Data
Lack of detailed land use (LU) information and efficient data collection methods have made the modeling of urban systems difficult. This study aims to develop a novel hierarchical rule-based LU extraction framework using geographic vector and remotely sensed (RS) data, in order to extract detailed subzonal LU information, residential LU in this study. The LU extraction system is developed to ex...
متن کاملDiscovering Popular Clicks\' Pattern of Teen Users for Query Recommendation
Search engines are still the most important gates for information search in internet. In this regard, providing the best response in the shortest time possible to the user's request is still desired. Normally, search engines are designed for adults and few policies have been employed considering teen users. Teen users are more biased in clicking the results list than are adult users. This leads...
متن کامل